34 research outputs found

    A multi-stage recurrent neural network better describes decision-related activity in dorsal premotor cortex

    We studied how a network of recurrently connected artificial units solves a visual perceptual decision-making task. The goal of this task is to discriminate the dominant color of a central static checkerboard and report the decision with an arm movement. This task has been used to study neural activity in the dorsal premotor (PMd) cortex. When a single recurrent neural network (RNN) was trained to perform the task, the activity of artificial units in the RNN differed from neural recordings in PMd, suggesting that inputs to PMd differed from inputs to the RNN. We expanded our architecture and examined how a multi-stage RNN performed the task. In the multi-stage RNN, the last stage exhibited similarities with PMd by representing direction information but not color information. We then investigated how the representations of color and direction information evolve across RNN stages. Together, our results demonstrate the importance of incorporating architectural constraints into RNN models. These constraints can improve the ability of RNNs to model neural activity in association areas. https://doi.org/10.32470/CCN.2019.1123-0 (Accepted manuscript)

    Frequency shifts and depth dependence of premotor beta band activity during perceptual decision-making

    Neural activity in the premotor and motor cortices shows prominent structure in the beta frequency range (13–30 Hz). Currently, the behavioral relevance of this beta band activity (BBA) is debated. The underlying source of motor BBA and how it changes as a function of cortical depth are also not completely understood. Here, we addressed these unresolved questions by investigating BBA recorded using laminar electrodes in the dorsal premotor cortex of 2 male rhesus macaques performing a visual reaction time (RT) reach discrimination task. We observed robust BBA before and after the onset of the visual stimulus but not during the arm movement. While poststimulus BBA was positively correlated with RT throughout the beta frequency range, prestimulus correlation varied by frequency. Low beta frequencies (∼12–20 Hz) were positively correlated with RT, and high beta frequencies (∼22–30 Hz) were negatively correlated with RT. Analysis and simulations suggested that these frequency-dependent correlations could emerge due to a shift in the component frequencies of the prestimulus BBA as a function of RT, such that faster RTs are accompanied by greater power in high beta frequencies. We also observed a laminar dependence of BBA, with deeper electrodes demonstrating stronger power in low beta frequencies both prestimulus and poststimulus. The heterogeneous nature of BBA and the changing relationship between BBA and RT in different task epochs may be a sign of the differential network dynamics involved in cue expectation, decision-making, motor preparation, and movement execution. (Published version)

    Attentional Networks and Biological Motion

    Our ability to see meaningful actions when presented with point-light traces of human movement is commonly referred to as the perception of biological motion. While traditional explanations have emphasized the spontaneous and automatic nature of this ability, more recent findings suggest that attention may play a larger role than is typically assumed. In two studies we show that the speed and accuracy of responding to point-light stimuli are highly correlated with the ability to control selective attention. In our first experiment we measured thresholds for determining the walking direction of a masked point-light figure, and performance on a range of attention-related tasks in the same set of observers. Mask-density thresholds for the direction discrimination task varied quite considerably from observer to observer, and this variation was highly correlated with performance on both Stroop and flanker interference tasks. Other components of attention, such as orienting, alerting, and visual search efficiency, showed no such relationship. In a second experiment, we examined the relationship between the ability to determine the orientation of unmasked point-light actions and Stroop interference, again finding a strong correlation. Our results are consistent with previous research suggesting that biological motion processing may require attention, and specifically implicate networks of attention related to executive control and selection.

    A mechanistic multi-area recurrent network model of decision-making

    Recurrent neural networks (RNNs) trained on neuroscience-based tasks have been widely used as models for cortical areas performing analogous tasks. However, very few tasks involve a single cortical area, and instead require the coordination of multiple brain areas. Despite the importance of multi-area computation, there is a limited understanding of the principles underlying such computation. We propose to use multi-area RNNs with neuroscience-inspired architecture constraints to derive key features of multi-area computation. In particular, we show that incorporating multiple areas and Dale's Law is critical for biasing the networks to learn biologically plausible solutions. Additionally, we leverage the full observability of the RNNs to show that output-relevant information is preferentially propagated between areas. These results suggest that cortex uses modular computation to generate minimal sufficient representations of task information. More broadly, our results suggest that constrained multi-area RNNs can produce experimentally testable hypotheses for computations that occur within and across multiple brain areas, enabling new insights into distributed computation in neural systems. https://proceedings.neurips.cc/paper/2021/file/c2f599841f21aaefeeabd2a60ef7bfe8-Paper.pdf (Published version)
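The abstract does not include the authors' training code; purely as an illustration of the kind of constraint described, Dale's Law can be imposed on a recurrent weight matrix by fixing the sign of each unit's outgoing weights (the function name and 80/20 excitatory/inhibitory split below are assumptions for the sketch):

```python
import numpy as np

def apply_dales_law(w_raw, n_excitatory):
    """Project an unconstrained weight matrix onto one obeying Dale's
    Law: each presynaptic unit (a column here) is either purely
    excitatory (non-negative outgoing weights) or purely inhibitory
    (non-positive outgoing weights)."""
    n_units = w_raw.shape[1]
    signs = np.ones(n_units)
    signs[n_excitatory:] = -1.0          # remaining units inhibit
    # Keep the learned magnitudes, impose each column's fixed sign.
    return np.abs(w_raw) * signs[np.newaxis, :]

rng = np.random.default_rng(0)
w = apply_dales_law(rng.normal(size=(10, 10)), n_excitatory=8)
```

In practice such a projection (or an equivalent sign-constrained parameterization) is applied at every training step, so gradient updates can change a weight's magnitude but never flip a unit's excitatory or inhibitory identity.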

    Non-linear dimensionality reduction on extracellular waveforms reveals cell type diversity in premotor cortex

    Cortical circuits are thought to contain a large number of cell types that coordinate to produce behavior. Current in vivo methods rely on clustering of specified features of extracellular waveforms to identify putative cell types, but these capture only a small amount of variation. Here, we develop a new method (WaveMAP) that combines non-linear dimensionality reduction with graph clustering to identify putative cell types. We apply WaveMAP to extracellular waveforms recorded from dorsal premotor cortex of macaque monkeys performing a decision-making task. Using WaveMAP, we robustly establish eight waveform clusters and show that these clusters recapitulate previously identified narrow- and broad-spiking types while revealing previously unknown diversity within these subtypes. The eight clusters exhibited distinct laminar distributions, characteristic firing rate patterns, and decision-related dynamics. Such insights were weaker when using feature-based approaches. WaveMAP therefore provides a more nuanced understanding of the dynamics of cell types in cortical circuits. https://elifesciences.org/articles/67490 (Published version)

    ChaRTr: An R toolbox for modeling choices and response times in decision-making tasks

    Background: Decision-making is the process of choosing and performing actions in response to sensory cues to achieve behavioral goals. Many mathematical models have been developed to describe the choice behavior and response time (RT) distributions of observers performing decision-making tasks. However, relatively few researchers use these models because doing so demands expertise in various numerical, statistical, and software techniques.
    New method: We present a toolbox, Choices and Response Times in R (ChaRTr), that provides the user the ability to implement and test a wide variety of decision-making models, ranging from classic through to modern versions of the diffusion decision model, to models with urgency signals or collapsing boundaries.
    Results: In three different case studies, we demonstrate how ChaRTr can be used to effortlessly discriminate between multiple models of decision-making behavior. We also provide guidance on how to extend the toolbox to incorporate future developments in decision-making models.
    Comparison with existing methods: Existing software packages have surmounted some of the numerical issues but have often focused on the classical decision-making model, the diffusion decision model. Recent models that posit roles for urgency, time-varying decision thresholds, noise in various aspects of the decision-formation process, or low-pass filtering of sensory evidence have proven challenging to incorporate in a coherent software framework that permits quantitative evaluation among these competing classes of decision-making models.
    Conclusion: ChaRTr can be used to make insightful statements about the cognitive processes underlying observed decision-making behavior and ultimately to gain deeper insights into decision mechanisms. https://www.sciencedirect.com/science/article/pii/S0165027019302894 (Published version)
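ChaRTr itself is an R package; purely as an illustration of the simplest model it covers, the diffusion decision model can be simulated with an Euler random walk of noisy evidence to one of two bounds (the function and parameter names below are generic placeholders, not ChaRTr's API):

```python
import numpy as np

def simulate_ddm(drift=0.2, bound=1.0, noise=1.0, dt=0.001,
                 t_nondecision=0.3, max_t=5.0, rng=None):
    """Simulate one trial of the diffusion decision model.
    Returns (choice, rt): choice is +1/-1, rt in seconds."""
    rng = rng if rng is not None else np.random.default_rng()
    x, t = 0.0, 0.0
    # Accumulate drift plus Gaussian noise until a bound is hit.
    while abs(x) < bound and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return (1 if x > 0 else -1), t + t_nondecision
```

The model variants the abstract mentions (urgency signals, collapsing boundaries, time-varying thresholds) add machinery on top of this core accumulation process, which is what makes a shared fitting framework like ChaRTr useful.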

    Monkeys and Humans Share a Common Computation for Face/Voice Integration

    Speech production involves the movement of the mouth and other regions of the face resulting in visual motion cues. These visual cues enhance intelligibility and detection of auditory speech. As such, face-to-face speech is fundamentally a multisensory phenomenon. If speech is fundamentally multisensory, it should be reflected in the evolution of vocal communication: similar behavioral effects should be observed in other primates. Old World monkeys share with humans vocal production biomechanics and communicate face-to-face with vocalizations. It is unknown, however, if they, too, combine faces and voices to enhance their perception of vocalizations. We show that they do: monkeys combine faces and voices in noisy environments to enhance their detection of vocalizations. Their behavior parallels that of humans performing an identical task. We explored what common computational mechanism(s) could explain the pattern of results we observed across species. Standard explanations or models such as the principle of inverse effectiveness and a "race" model failed to account for their behavior patterns. Conversely, a "superposition model", positing the linear summation of activity patterns in response to visual and auditory components of vocalizations, served as a straightforward but powerful explanatory mechanism for the observed behaviors in both species. As such, it represents a putative homologous mechanism for integrating faces and voices across primates.
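The contrast between the two model classes can be sketched numerically: a race model takes the faster of two unisensory detection times, whereas a superposition model sums the unisensory responses before threshold detection. The response functions below are invented for illustration and are not the authors' fitted model:

```python
import numpy as np

def detection_time(response, threshold, dt=0.001):
    """First time (s) at which a response trace crosses threshold;
    returns np.inf if it never does."""
    idx = np.argmax(response >= threshold)
    if response[idx] < threshold:        # argmax is 0 if no crossing
        return np.inf
    return idx * dt

t = np.arange(0, 1.0, 0.001)
# Toy unisensory response buildups (arbitrary units and time constants).
visual = 0.6 * (1 - np.exp(-t / 0.15))
auditory = 0.6 * (1 - np.exp(-t / 0.20))
threshold = 0.5

# Race model: whichever unisensory channel crosses threshold first.
race = min(detection_time(visual, threshold),
           detection_time(auditory, threshold))
# Superposition: linear summation crosses threshold earlier.
superposition = detection_time(visual + auditory, threshold)
```

This illustrates why summation predicts faster multisensory detection than a race between channels, the qualitative signature the abstract attributes to both species.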

    The Natural Statistics of Audiovisual Speech

    Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time-varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain, where it can guide the selection of appropriate actions. To simplify this process, it has been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both the area of the mouth opening and the voice envelope are temporally modulated in the 2–7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech, which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.
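Statistics of this kind (cross-signal correlation and a modulation-spectrum peak) can be estimated with standard tools; the sketch below uses synthetic traces sharing a 4 Hz "syllabic" modulation, not the paper's recordings:

```python
import numpy as np

fs = 100.0                               # samples per second
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)

# Toy signals: "mouth area" and "acoustic envelope" share a common
# 4 Hz modulation, plus independent measurement noise.
modulation = 1 + np.sin(2 * np.pi * 4 * t)
mouth_area = modulation + 0.3 * rng.standard_normal(t.size)
envelope = modulation + 0.3 * rng.standard_normal(t.size)

# Correlation between the two traces.
r = np.corrcoef(mouth_area, envelope)[0, 1]

# Modulation spectrum of the mouth-area trace (DC removed);
# its peak recovers the shared modulation frequency.
spectrum = np.abs(np.fft.rfft(mouth_area - mouth_area.mean()))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
peak_hz = freqs[np.argmax(spectrum)]
```

On real data one would additionally estimate the lag of peak correlation, which is how a consistent 100–300 ms mouth-leads-voice offset would show up.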

    Different Neural Frequency Bands Integrate Faces and Voices Differently in the Superior Temporal Sulcus

    The integration of auditory and visual information is required for the default mode of speech: face-to-face communication. As revealed by functional magnetic resonance imaging and electrophysiological studies, the regions in and around the superior temporal sulcus (STS) are implicated in this process. To provide greater insights into the network-level dynamics of the STS during audiovisual integration, we used a macaque model system to analyze the different frequency bands of local field potential (LFP) responses to the auditory and visual components of vocalizations. These vocalizations (like human speech) have a natural time delay between the onset of visible mouth movements and the onset of the voice (the "time-to-voice" or TTV). We show that the LFP responses to faces and voices elicit distinct bands of activity in the theta (4–8 Hz), alpha (8–14 Hz), and gamma (>40 Hz) frequency ranges. Along with single-neuron responses, the gamma band activity was greater for face stimuli than voice stimuli. Surprisingly, the opposite was true for the low-frequency bands: auditory responses were of a greater magnitude. Furthermore, gamma band responses in STS were sustained for dynamic faces but not for voices (the opposite is true for auditory cortex). These data suggest that visual and auditory stimuli are processed in fundamentally different ways in the STS. Finally, we show that the three bands integrate faces and voices differently: theta band activity showed weak multisensory behavior regardless of TTV, the alpha band activity was enhanced for calls with short TTVs but showed little integration for longer TTVs, and finally, the gamma band activity was consistently enhanced for all TTVs. These data demonstrate that LFP activity from the STS can be segregated into distinct frequency bands which integrate audiovisual communication signals in an independent manner. These different bands may reflect different spatial scales of network processing during face-to-face communication.
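Band definitions like these are conventionally applied by measuring spectral power within each frequency range; a minimal sketch on a synthetic signal (band edges as quoted in the abstract, with an assumed 100 Hz upper limit on gamma):

```python
import numpy as np

BANDS = {"theta": (4, 8), "alpha": (8, 14), "gamma": (40, 100)}

def band_power(lfp, fs, band):
    """Mean spectral power of `lfp` within `band` (Hz, half-open)."""
    freqs = np.fft.rfftfreq(lfp.size, d=1 / fs)
    power = np.abs(np.fft.rfft(lfp)) ** 2
    lo, hi = band
    mask = (freqs >= lo) & (freqs < hi)
    return power[mask].mean()

fs = 1000.0
t = np.arange(0, 2, 1 / fs)
rng = np.random.default_rng(0)
# Toy "LFP": a 6 Hz theta-band oscillation plus broadband noise,
# so theta-band power should dominate gamma-band power.
lfp = np.sin(2 * np.pi * 6 * t) + 0.5 * rng.standard_normal(t.size)
theta = band_power(lfp, fs, BANDS["theta"])
gamma = band_power(lfp, fs, BANDS["gamma"])
```

Real LFP analyses would typically use tapered or wavelet spectral estimates rather than a single raw FFT, but the band-masking logic is the same.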